DEGREE OF OUTLIER CALCULATION DEVICE, AND PROBABILITY 
DENSITY ESTIMATION DEVICE AND HISTOGRAM CALCULATION 
DEVICE FOR USE THEREIN 

BACKGROUND OF THE INVENTION 

FIELD OF THE INVENTION 

The present invention relates to a degree of 
outlier calculation device, and a probability density 
estimation device and a histogram calculation device for 
use therein and, more particularly, to statistical 
outlier detection and fraud detection techniques for 
detecting an abnormal value or an outlier which deviates 
largely from the data patterns obtained so far from 
multi-dimensional time series data. 

DESCRIPTION OF THE RELATED ART 

Such a degree of outlier calculation device is 
used to find an abnormal value or an outlier which 
deviates largely from the data patterns obtained so far 
from multi-dimensional time series data. It is employed, 
for example, to find fraudulent behavior such as 
so-called cloning from records of cellular phone 
services, or to find abnormal transactions in the usage 
history of a credit card. 

Well-known conventional fraud detection methods 
using a machine learning technique include the method by 
T. Fawcett and F. Provost ("Combining Data Mining and 
Machine Learning for Effective Fraud Detection", 
Proceedings of AI Approaches to Fraud Detection and Risk 
Management, pp. 14-19, 1997) and the method by J. Ryan, 
M. Lin and R. Miikkulainen ("Intrusion Detection with 
Neural Networks", Proceedings of AI Approaches to Fraud 
Detection and Risk Management, pp. 72-77, 1997). 

Among such methods, one that makes use of the 
idea of statistical outlier detection, in particular, is 
the method by P. Burge and J. Shawe-Taylor ("Detecting 
Cellular Fraud Using Adaptive Prototypes", Proceedings of 
AI Approaches to Fraud Detection and Risk Management, 
pp. 9-13, 1997). 

As a learning algorithm for a parametric finite 
mixture model, the EM algorithm by A.P. Dempster, 
N.M. Laird and D.B. Rubin is well known ("Maximum 
Likelihood from Incomplete Data via the EM Algorithm", 
Journal of the Royal Statistical Society, B, 39(1), 
pp. 1-38, 1977). 

As a learning algorithm for a normal kernel 
mixture distribution (a mixture of a finite number of 
the same normal distributions), the prototype updating 
algorithm by I. Grabec is known ("Self-Organization of 
Neurons Described by the Maximum-Entropy Principle", 
Biological Cybernetics, vol. 63, pp. 403-409, 1990). 
The above-described methods by T. Fawcett and F. 
Provost and by J. Ryan, M. Lin and R. Miikkulainen 
relate to fraud detection realized by learning fraud 
patterns from data whose fraud is known (so-called 
supervised data). In practice, however, it is so 
difficult to obtain sufficient fraud data that highly 
precise learning cannot be conducted, which results in 
a decrease in fraud detection precision. 

The method by P. Burge and J. Shawe-Taylor 
relates to similar fraud detection based on unsupervised 
data. This method, however, conducts fraud detection 
with two non-parametric models, a short-term model and a 
long-term model, and uses the distance between them as 
the criterion for an outlier. Because the statistical 
basis of the short-term model and the long-term model is 
insufficient, the statistical significance of the 
distance between them is unclear. 
In addition, preparing two models, a short-term 
model and a long-term model, degrades calculation 
efficiency. There are further problems: only continuous 
value data can be handled and not categorical data, and, 
since only non-parametric models are handled, fraud 
detection is unstable and inefficient. 

Although the EM algorithm by A.P. Dempster, 
N.M. Laird and D.B. Rubin and the prototype updating 
algorithm by I. Grabec are known as learning algorithms 
for a statistical model, these algorithms learn from all 
the past data with equal weight and therefore fail to 
cope with a change in pattern. 
SUMMARY OF THE INVENTION 
An object of the present invention is to provide 
a degree of outlier calculation device capable of 
automatically detecting fraud based on data whose fraud 
is yet to be known (unsupervised data), and a 
probability density estimation device and a histogram 
calculation device for use therein. 

Another object of the present invention is to 
provide a degree of outlier calculation device which 
adopts an outlier determination criterion whose 
statistical significance is clear and uses a single 
model combining the short-term and long-term models, 
thereby improving efficiency of calculation, coping 
with categorical data and enabling stable and efficient 
outlier detection using not only a non-parametric model 
but also a parametric model, and a probability density 
estimation device and a histogram calculation device for 
use therein. 

A further object of the present invention is to 
provide a degree of outlier calculation device which 
implements an algorithm that learns while forgetting 
past data by giving less weight to older data, so that 
even a change in pattern can be followed flexibly, 
and a probability density estimation device and a 
histogram calculation device for use therein. 

According to the first aspect of the invention, 
for use in a degree of outlier calculation device for 
sequentially calculating a degree of outlier of each 
data with a data sequence of real vector values as input, 
a probability density estimation device for, while 
sequentially reading the data sequence, estimating a 
probability distribution of the data in question by 
using a finite mixture of normal distributions (normal 
mixture for short), comprises 

probability calculation means for calculating, 
based on a value of input data and values of a mean 
parameter and a variance parameter of each of a finite 
number of normal distribution densities, a probability 
of generation of the input data in question from each 
normal distribution, and 

parameter rewriting means for updating and 
rewriting the stored parameter values while forgetting 
past data, according to newly read data based on a 
probability obtained by the probability calculation 
means, values of a mean parameter and a variance 
parameter of each normal distribution and a weighting 
parameter of each normal distribution. 

In the preferred construction, the probability 
density estimation device further comprises 

parameter storage means for storing values of a 
mean parameter and a variance parameter of each of a 
finite number of normal distribution densities and a 
weighting parameter of each normal distribution, wherein 

the parameter rewriting means updates and 
rewrites data of the parameter storage means. 
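
As a rough illustration of the two means described above, the following sketch computes, for a one-dimensional normal mixture, the probability that an input value was generated by each normal distribution and then rewrites the weighting parameters while discounting their old values. The posterior form of the probability, the discounting rule and the name `forget` are assumptions made for illustration only; they are not taken from this section, and `forget` is not the constant r of the embodiment.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a one-dimensional normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def responsibilities(x, weights, means, variances):
    """Probability that x was generated by each component (assumed posterior form).

    Note: for an x extremely far from every component the densities can
    underflow to zero; a production version would work in log space.
    """
    joint = [c * normal_pdf(x, m, v) for c, m, v in zip(weights, means, variances)]
    total = sum(joint)
    return [j / total for j in joint]


def update_weights(weights, gammas, forget):
    """Hypothetical discounting rule: old weights shrink by (1 - forget),
    and the newly read datum contributes with weight `forget`."""
    return [(1.0 - forget) * c + forget * g for c, g in zip(weights, gammas)]


gammas = responsibilities(0.0, [0.5, 0.5], [0.0, 10.0], [1.0, 1.0])
new_weights = update_weights([0.5, 0.5], gammas, forget=0.1)
```

Because both the old weights and the computed probabilities sum to one, the rewritten weights in this sketch also sum to one.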

According to the second aspect of the invention, 
a degree of outlier calculation device for sequentially 
detecting a degree of outlier of each data with a data 
sequence of real vector values as input, comprises 

a probability density estimation device for, 
while sequentially reading the data sequence, estimating 
a probability distribution of generation of the data in 
question by using a finite mixture of normal 
distributions including 

(a) parameter storage means for storing values of 
a mean parameter and a variance parameter of each of a 
finite number of normal distribution densities and a 
weighting parameter of each normal distribution, 

(b) probability calculation means for calculating, 
based on a value of input data and values of a mean 
parameter and a variance parameter of each of a finite 
number of normal distribution densities, a probability 
of generation of the input data in question from each 
normal distribution, and 

(c) parameter rewriting means for updating and 
rewriting the stored parameter values while forgetting 
past data, according to newly read data based on a 
probability obtained by the probability calculation 
means, values of a mean parameter and a variance 
parameter of each normal distribution and a weighting 
parameter of each normal distribution, and 

degree of outlier calculation means for 
calculating and outputting a degree of outlier of the 
data by using a parameter of the normal mixture updated 
by the probability density estimation device and based 
on a probability distribution estimated from values of 
the parameters before and after the updating and the 
input data. 

According to the third aspect of the invention, a 
probability density estimation device for use in a 
degree of outlier calculation device to, while 
sequentially reading a data sequence, estimate a 
probability distribution of generation of the data in 
question by using a finite number of normal kernel 
distributions, comprises 

parameter storage means for storing a value of a 
parameter indicative of a position of each kernel, and 

parameter rewriting means for reading a value of 
a parameter from the storage means and updating the 
stored parameter values while forgetting past data, 
according to newly read data to rewrite the contents of 
the parameter storage means. 
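
To make the role of the kernel position parameters concrete, the following minimal sketch evaluates a density built from a finite number of identical normal kernels. It assumes equal weights for all kernels and a common spherical spread `sigma`; both, together with the names `prototypes` and `kernel_mixture_density`, are illustrative assumptions and not part of the description above.

```python
import math


def kernel_mixture_density(x, prototypes, sigma):
    """Equal-weight mixture of identical spherical normal kernels.

    x          -- input vector (list of floats)
    prototypes -- list of kernel positions, each the same length as x
    sigma      -- common standard deviation of every kernel (assumed given)
    """
    d = len(x)
    norm = (2.0 * math.pi * sigma ** 2) ** (-d / 2.0)
    total = 0.0
    for q in prototypes:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, q))
        total += norm * math.exp(-0.5 * sq_dist / sigma ** 2)
    return total / len(prototypes)


positions = [[0.0, 0.0], [1.0, 1.0]]
near = kernel_mixture_density([0.0, 0.0], positions, 0.5)
far = kernel_mixture_density([5.0, 5.0], positions, 0.5)
```

In such a model only the kernel positions need to be stored and rewritten, which matches the storage means described above.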

According to another aspect of the invention, a 
degree of outlier calculation device for sequentially 
calculating a degree of outlier of each data with a data 
sequence of real vector values as input, comprises 

a probability density estimation device for, 
while sequentially reading the data sequence, estimating 
a probability distribution of generation of the data in 
question by using a finite number of normal kernel 
distributions including 

(a) parameter storage means for storing a value 
of a parameter indicative of a position of each kernel, 
and 

(b) parameter rewriting means for reading a value 
of a parameter from the storage means and updating the 
stored parameter values while forgetting past data, 
according to newly read data to rewrite the contents of 
the parameter storage means, and 

degree of outlier calculation means for 
calculating and outputting a degree of outlier of the 
data by using the parameter updated by the probability 
density estimation device and based on a probability 
distribution estimated from values of the parameters 
before and after the updating and the input data. 

According to another aspect of the invention, for 
use in a degree of outlier calculation device for 
sequentially calculating a degree of outlier of each 
data with discrete value data as input, a histogram 
calculation device for calculating a parameter of a 
histogram with respect to the discrete value data 
sequentially input, comprises 

storage means for storing a parameter value of 
the histogram, and 

parameter updating means for reading the 
parameter value from the storage means and updating past 
parameter values while forgetting past data based on 
input data to rewrite the value of the storage means, 
thereby outputting some of the parameter values of the 
storage means. 

According to another aspect of the invention, a 
degree of outlier calculation device for sequentially 
calculating a degree of outlier of each data with 
discrete value data as input, comprises 

a histogram calculation device for calculating a 
parameter of a histogram with respect to the discrete 
value data sequentially input including 

storage means for storing a parameter value of 
the histogram, and 

parameter updating means for reading the 
parameter value from the storage means and updating past 
parameter values while forgetting past data based on 
input data to rewrite the value of the storage means, 
thereby outputting some of the parameter values of the 
storage means, and 

score calculation means for calculating, based on 
the output of the histogram calculation device and the 
input data, a score of the input data in question with 
respect to the histogram, thereby outputting the output 
of the score calculation means as a degree of outlier of 
the input data. 
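
The histogram-based device can be pictured with the following sketch, which keeps one decayed count per cell and scores an input by the negative logarithm of its estimated cell probability. The exponential decay, the additive smoothing and the names `decay` and `smoothing` are assumptions chosen for illustration; this section does not specify the actual updating rule.

```python
import math


class ForgettingHistogram:
    """Histogram over discrete cells whose counts fade as new data arrive."""

    def __init__(self, num_cells, decay=0.9, smoothing=0.5):
        self.decay = decay          # hypothetical factor; smaller means faster forgetting
        self.smoothing = smoothing  # hypothetical constant keeping probabilities above zero
        self.counts = [0.0] * num_cells

    def probability(self, cell):
        total = sum(self.counts) + self.smoothing * len(self.counts)
        return (self.counts[cell] + self.smoothing) / total

    def score(self, cell):
        """Degree of outlier of an input falling in `cell` (negative log probability)."""
        return -math.log(self.probability(cell))

    def update(self, cell):
        """Discount all stored counts, then count the new input."""
        self.counts = [self.decay * c for c in self.counts]
        self.counts[cell] += 1.0


hist = ForgettingHistogram(num_cells=3)
for _ in range(20):
    hist.update(0)
score_rare_before = hist.score(2)
score_common_before = hist.score(0)
for _ in range(30):
    hist.update(2)  # the pattern changes: cell 2 becomes the usual one
```

After the second loop the old observations of cell 0 have largely faded, so an input in cell 0 now receives the higher score, which is the pattern-following behavior that forgetting is meant to provide.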
According to another aspect of the invention, a 
degree of outlier calculation device for calculating a 
degree of outlier with respect to sequentially input 
data which is described both in a discrete value and in 
a continuous value, comprises 

a histogram calculation device for estimating a 
histogram with respect to a discrete value data part, 

probability density estimation devices provided 
as many as the number of cells of the histogram for 
estimating a probability density with respect to a 
continuous value data part, 

cell determination means for determining to which 
cell of the histogram the discrete value data part 
belongs to send the continuous data part to the 
corresponding one of the probability density estimation 
devices, and 

score calculation means for calculating a score 
of the input data based on a probability distribution 
estimated from output values of the histogram 
calculation device and the probability density 
estimation device and the input data, thereby 
outputting the output of the score calculation 
means as a degree of outlier of the input data, 

the histogram calculation device including 

storage means for storing a parameter value of 
the histogram, and 

parameter updating means for reading the 
parameter value from the storage means and updating past 
parameter values while forgetting past data based on 
input data to rewrite the value of the storage means, 
thereby outputting some of the parameter values of the 
storage means, and 

the probability density estimation device 
including 

parameter storage means for storing values of a 
mean parameter and a variance parameter of each of a 
finite number of normal distribution densities and a 
weighting parameter of each normal distribution, 

probability calculation means for calculating, 
based on a value of input data, and values of a mean 
parameter and a variance parameter of each of a finite 
number of normal distribution densities, a probability 
of generation of the input data in question from each 
normal distribution, and 

parameter rewriting means for updating and 
rewriting the stored parameter values while forgetting 
past data, according to newly read data based on a 
probability obtained by the probability calculation 
means, values of a mean parameter and a variance 
parameter of each normal distribution and a weighting 
parameter of each normal distribution. 
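
The interplay of cell determination and score calculation for mixed data can be sketched as follows. For brevity each cell is given a single one-dimensional normal density instead of a normal mixture, and the score is taken to be the negative logarithm of the joint probability of the cell and the continuous part; both simplifications, and all names and numbers used, are assumptions for illustration only.

```python
import math


def hybrid_score(cell, y, cell_probs, cell_models):
    """Score of a record (cell, y): -log( P(cell) * p(y | cell) ).

    cell_probs  -- estimated probability of each histogram cell
    cell_models -- per-cell (mean, variance) of the continuous part
    """
    # Cell determination: pick the density estimator that belongs to this cell.
    mean, var = cell_models[cell]
    log_density = -0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)
    return -math.log(cell_probs[cell]) - log_density


# Hypothetical example: origin of service (discrete) and call duration (continuous).
cell_probs = {"domestic": 0.9, "overseas": 0.1}
cell_models = {"domestic": (60.0, 400.0), "overseas": (300.0, 10000.0)}

typical = hybrid_score("domestic", 60.0, cell_probs, cell_models)
rare_cell = hybrid_score("overseas", 300.0, cell_probs, cell_models)
odd_value = hybrid_score("domestic", 600.0, cell_probs, cell_models)
```

A record is thus scored highly either because its discrete part falls in a rare cell or because its continuous part is unusual for that cell.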

According to another aspect of the invention, a 
degree of outlier calculation device for calculating a 
degree of outlier with respect to sequentially input 
data which is described both in a discrete value and in 
a continuous value, comprises 

a histogram calculation device for estimating a 
histogram with respect to the discrete value data part, 

probability density estimation devices provided 
as many as the number of cells of the histogram for 
estimating a probability density with respect to a 
continuous value data part, 

cell determination means for determining to which 
cell of the histogram the discrete value data part 
belongs to send the continuous data part to the 
corresponding one of the probability density estimation 
devices, and 

score calculation means for calculating a score 
of the input data based on a probability distribution 
estimated from output values of the histogram 
calculation device and the probability density 
estimation device and the input data, thereby 
outputting the output of the score calculation 
means as a degree of outlier of the input data, 

the histogram calculation device including 

storage means for storing a parameter value of 
the histogram, and 

parameter updating means for reading the 
parameter value from the storage means and updating past 
parameter values while forgetting past data based on 
input data to rewrite the value of the storage means, 
thereby outputting some of the parameter values of the 
storage means, and 

the probability density estimation device 
including 

parameter storage means for storing a value of a 
parameter indicative of a position of each kernel, and 

parameter rewriting means for reading a value of 
a parameter from the storage means and updating the 
stored parameter values while forgetting past data, 
according to newly read data to rewrite the contents of 
the parameter storage means. 

According to another aspect of the invention, for 
use in a degree of outlier calculation device for 
sequentially calculating a degree of outlier of each 
data with a data sequence of real vector values as input, 
a probability density estimation method of, while 
sequentially reading the data sequence, estimating a 
probability distribution of generation of the data in 
question by using a finite mixture of normal 
distributions, comprises the steps of 

based on a value of input data and on values of a 
mean parameter and a variance parameter of each of a 
finite number of normal distribution densities, read 
from parameter storage means which stores the values of 
the mean parameter and the variance parameter of each of 
the normal distribution densities and a weighting 
parameter of each normal distribution, calculating a 
probability of generation of the input data in question 
from each normal distribution, and 

updating the stored parameter values while 
forgetting past data, according to newly read data based 
on the probability thus calculated, values of a mean 
parameter and a variance parameter of each normal 
distribution and a weighting parameter of each normal 
distribution to rewrite data of the parameter storage 
means. 

According to another aspect of the invention, a 
degree of outlier calculation method of sequentially 
calculating a degree of outlier of each data, with a 
data sequence of real vector values as input, wherein 

probability density estimation for, while 
sequentially reading the data sequence, estimating a 
probability distribution of generation of the data in 
question by using a finite mixture of normal 
distributions, comprises the steps of: 

based on a value of input data and on values of a 
mean parameter and a variance parameter of each of a 
finite number of normal distribution densities, read 
from parameter storage means which stores the values of 
the mean parameter and the variance parameter of each of 
the normal distribution densities and a weighting 
parameter of each normal distribution, calculating a 
probability of generation of the input data in question 
from each normal distribution, and 

updating the stored parameter values while 
forgetting past data, according to newly read data based 
on the probability thus calculated, values of a mean 
parameter and a variance parameter of each normal 
distribution and a weighting parameter of each normal 
distribution to rewrite data of the parameter storage 
means, and which further comprises the step of: 

calculating and outputting a degree of outlier of 
the data by using a parameter of the normal mixture 
updated by the probability density estimation and based 
on a probability distribution estimated from values of 
the parameters before and after the updating and the 
input data. 

According to another aspect of the invention, a 
probability density estimation method for use in 
calculation of a degree of outlier to, while 
sequentially reading a data sequence, estimate a 
probability distribution of generation of the data in 
question by using a finite number of normal kernel 
distributions, comprises the steps of: 

storing a value of a parameter indicative of a 
position of each kernel in parameter storage means, and 

reading a value of a parameter from the storage 
means and updating the stored parameter values while 
forgetting past data, according to newly read data to 
rewrite the contents of the parameter storage means. 
According to another aspect of the invention, a 
degree of outlier calculation method of sequentially 
calculating a degree of outlier of each data, with a 
data sequence of real vector values as input, wherein 

probability density estimation for, while 
sequentially reading the data sequence, estimating a 
probability distribution of generation of the data in 
question by using a finite number of normal kernel 
distributions comprises the steps of: 

storing a value of a parameter indicative of a 
position of each kernel in parameter storage means, and 

reading a value of a parameter from the storage 
means and updating the stored parameter values while 
forgetting past data, according to newly read data to 
rewrite the contents of the parameter storage means, and 
which further comprises the step of: 

calculating and outputting a degree of outlier of 
the data by using the parameter updated by the 
probability density estimation and based on a 
probability distribution estimated from values of the 
parameters before and after the updating and the input 
data. 

According to another aspect of the invention, for 
use in calculation of a degree of outlier for 
sequentially calculating a degree of outlier of each 
data with discrete value data as input, a histogram 
calculation method of calculating a parameter of a 
histogram with respect to the discrete value data 
sequentially input, comprises the steps of: 

reading the parameter value from storage means 
for storing a parameter value of the histogram and 
updating past parameter values while forgetting past 
data based on input data to rewrite the value of the 
storage means, and 

outputting some of the parameter values of the 
storage means. 

According to a further aspect of the invention, a 
degree of outlier calculation device for sequentially 
calculating a degree of outlier of each data with 
discrete value data as input, comprises: 

a histogram calculation device for calculating a 
parameter of a histogram with respect to the discrete 
value data sequentially input including 

storage means for storing a parameter value of 
the histogram, and 

parameter updating means for reading the 
parameter value from the storage means and updating past 
parameter values while forgetting past data based on 
input data to rewrite the value of the storage means, 
thereby outputting some of the parameter values of the 
storage means, and 

score calculation means for calculating, based on 
the output of the histogram calculation device and the 
input data, a score of the input data in question with 
respect to the histogram, thereby outputting the score 
calculation result as a degree of outlier of the input 
data. 

According to a still further aspect of 
the invention, a degree of outlier calculation method of 
calculating a degree of outlier with respect to 
sequentially input data which is described both in a 
discrete value and in a continuous value, wherein 

histogram calculation which estimates a histogram 
with respect to a discrete value data part comprises the 
steps of: 

reading the parameter value from storage means 
for storing a parameter value of the histogram and 
updating past parameter values while forgetting past 
data based on input data to rewrite the value of the 
storage means, and 

outputting some of the parameter values of the 
storage means, and wherein 

in probability density estimation devices 
provided as many as the number of cells of the histogram 
for estimating a probability density with respect to a 
continuous value data part, the method comprises the 
steps of: 

based on a value of input data and on values of a 
mean parameter and a variance parameter of each of a 
finite number of normal distribution densities, read 
from parameter storage means which stores the values of 
the mean parameter and the variance parameter of each of 
the normal distribution densities and a weighting 
parameter of each normal distribution, calculating a 
probability of generation of the input data in question 
from each normal distribution, and 

based on the probability thus calculated, values 
of a mean parameter and a variance parameter of each 
normal distribution and a weighting parameter of each 
normal distribution, updating the stored parameter 
values while forgetting past data, according to newly 
read data to rewrite the data of the parameter storage 
means, and wherein the method further comprises the 
steps of: 

determining to which cell of the histogram the 
discrete value data part belongs to send the continuous 
data part to the corresponding one of the probability 
density estimation devices, 

calculating a score of the input data based on a 
probability distribution estimated from output values of 
the histogram calculation device and the probability 
density estimation device and the input data, and 

outputting the score calculation result as a 
degree of outlier of the input data. 

In the present invention, with one value of time 
series data denoted as x and assuming that input data is 
multi-dimensional data, the contents of x include, for 
example, one real number, a discrete-valued attribute, a 
multi-dimensional real-valued vector, and a 
multi-dimensional vector having the foregoing as its 
elements. In the case of a cellular phone, x may be 
expressed as follows, which is one example only: 

x = (telephone service start time, telephone 
service duration time, origin of service) 

A probability density function of the probability 
distribution followed by x represents the character of 
the data generation mechanism (e.g. the telephone 
service pattern of a user). The degree of outlier 
calculation device according to the present invention 
learns the probability density function every time a 
value of the time series data is input. Under these 
circumstances, a "degree of outlier" is basically 
calculated based on the two ideas (A) and (B) shown 
below. 

(A) A degree of outlier of one input data is calculated 
based on the amount by which the learned probability 
density changes from that before learning as a result of 
taking in the input data. This is on the premise that 
data differing largely in tendency from the learned 
probability density function is considered to have a 
high degree of outlier. More specifically, a function of 
the distance between the probability densities before 
and after the data input is calculated as the degree of 
outlier. 

(B) The likelihood, with respect to the input data, of 
the probability density function obtained so far by 
learning is calculated (that is, the value of the 
probability density function at the input data). It can 
be understood that the smaller the likelihood is, the 
higher the degree of outlier is. In practice, a value 
obtained by adding a negative sign to the logarithm of 
the likelihood (negative logarithmic likelihood) is 
output as the degree of outlier. 
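
Idea (B) can be illustrated with the following minimal sketch, which assumes that the learned density is a one-dimensional normal mixture given as a list of (weight, mean, variance) triples. The function names and the example parameter values are hypothetical.

```python
import math


def mixture_pdf(x, components):
    """Value at x of a one-dimensional normal mixture.

    components -- list of (weight, mean, variance) triples
    """
    return sum(
        c * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
        for c, m, v in components
    )


def negative_log_likelihood(x, components):
    """Degree of outlier of x: minus the logarithm of the learned density at x."""
    return -math.log(mixture_pdf(x, components))


learned = [(0.7, 0.0, 1.0), (0.3, 5.0, 1.0)]  # hypothetical learned parameters
usual = negative_log_likelihood(0.0, learned)
unusual = negative_log_likelihood(10.0, learned)
```

An input lying where the learned density is low, such as 10.0 in this example, receives the larger score.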

In addition, a combination of the above two 
functions and the like can be used. As described in the 
foregoing, the device according to the present invention 
represents the statistical character of a data 
generation mechanism by a probability density function 
(the function of the probability density estimation 
device) and, based thereon, calculates and outputs how 
far input data deviates from the character of the data 
generation mechanism as a "degree of outlier" (the 
function of the degree of outlier calculation device). 
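
Idea (A), a distance between the probability densities before and after taking in the input data, can be sketched as follows. The squared Hellinger distance is used here only as one possible choice of distance and is evaluated by simple numerical integration in one dimension; the choice of distance, the integration range and all names are assumptions for illustration.

```python
import math


def mixture_pdf(x, components):
    """Value at x of a one-dimensional normal mixture of (weight, mean, variance) triples."""
    return sum(
        c * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
        for c, m, v in components
    )


def hellinger_sq(before, after, lo, hi, steps=2000):
    """Midpoint-rule approximation of the integral of (sqrt(p) - sqrt(q))^2 over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for n in range(steps):
        x = lo + (n + 0.5) * h
        diff = math.sqrt(mixture_pdf(x, before)) - math.sqrt(mixture_pdf(x, after))
        total += diff * diff
    return total * h


before = [(1.0, 0.0, 1.0)]
after_small_change = [(1.0, 0.1, 1.0)]  # density barely moved by the new datum
after_large_change = [(1.0, 2.0, 1.0)]  # density moved a lot by the new datum

small = hellinger_sq(before, after_small_change, -10.0, 12.0)
large = hellinger_sq(before, after_large_change, -10.0, 12.0)
```

A datum that forces a large change in the learned density yields the larger distance and is therefore given the higher degree of outlier.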

Other objects, features and advantages of the 
present invention will become clear from the detailed 
description given herebelow. 



BRIEF DESCRIPTION OF THE DRAWINGS 

The present invention will be understood more 
fully from the detailed description given herebelow and 
from the accompanying drawings of the preferred 
embodiment of the invention, which, however, should not 
be taken as limiting the invention, but are for 
explanation and understanding only. 

In the drawings: 
Fig. 1 is a diagram showing a structure of one 
example of a probability density estimation device 
(normal mixture) according to the present invention; 

Fig. 2 is a flow chart showing operation of the 
device illustrated in Fig. 1; 

Fig. 3 is a diagram showing a structure of an 
example of a degree of outlier calculation device using 
the device of Fig. 1; 

Fig. 4 is a flow chart of operation of the device 
illustrated in Fig. 3; 

Fig. 5 is a diagram showing a structure of one 
example of a probability density estimation device 
(kernel mixture) according to the present invention; 

Fig. 6 is a flow chart of operation of the device 
illustrated in Fig. 5; 

Fig. 7 is a diagram showing a structure of an 
example of a degree of outlier calculation device using 
the device of Fig. 5; 

Fig. 8 is a flow chart of operation of the device 
illustrated in Fig. 7; 

Fig. 9 is a diagram showing a structure of one 
example of a histogram calculation device according to 
the present invention; 

Fig. 10 is a flow chart of operation of the 
device illustrated in Fig. 9; 

Fig. 11 is a diagram showing a structure of an 
example of a degree of outlier calculation device using 
the device of Fig. 9; 

Fig. 12 is a flow chart of operation of the 
device illustrated in Fig. 11; 

Fig. 13 is a diagram showing a structure of an 
example of a degree of outlier calculation device using 
the devices of Figs. 1 and 9; 

Fig. 14 is a flow chart of operation of the 
device illustrated in Fig. 13; 

Fig. 15 is a diagram showing a structure of an 
example of a degree of outlier calculation device using 
the devices of Figs. 5 and 9; and 

Fig. 16 is a flow chart of operation of the 
device illustrated in Fig. 15. 

DESCRIPTION OF THE PREFERRED EMBODIMENT 

The preferred embodiment of the present invention 
will be discussed hereinafter in detail with reference 
to the accompanying drawings. In the following 
description, numerous specific details are set forth in 
order to provide a thorough understanding of the present 
invention. It will be obvious, however, to those skilled 
in the art that the present invention may be practiced 
without these specific details. In other instances, 
well-known structures are not shown in detail in order 
not to unnecessarily obscure the present invention. 

First, description will be made of a probability 
density estimation device using a normal mixture. Assume 
that data x (a d-dimensional vector value) is generated 
according to the following Expression (1) as a 
probability distribution: 

$$p(x \mid \theta) = \sum_{i=1}^{k} c_i \, p(x \mid \mu_i, \Sigma_i) \qquad (1)$$

In the expression, the following holds: 

$$p(x \mid \mu_i, \Sigma_i) = \frac{1}{(2\pi)^{d/2} \, |\Sigma_i|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu_i)^{T} \Sigma_i^{-1} (x - \mu_i) \right)$$

and $\mu_i$ denotes a d-dimensional vector which is a 
parameter indicative of the mean value of a d-dimensional 
normal distribution and $\Sigma_i$ denotes a d-dimensional 
square matrix which is a parameter indicative of the 
variance of the d-dimensional normal distribution. $c_i$ 
denotes a parameter indicative of the weight of a normal 
distribution. Here, k represents an integer indicative 
of the number of overlaps (the number of normal 
distributions in the mixture) and the following holds: 

$$c_i \ge 0 \quad \text{and} \quad \sum_{i=1}^{k} c_i = 1$$

It is also assumed that 
$\theta = (c_1, \mu_1, \Sigma_1, \ldots, c_k, \mu_k, \Sigma_k)$ 
represents a parameter vector. 

Fig. 1 is a block diagram showing a probability
density estimation device according to one embodiment of
the present invention. Assume here that a constant $r$
($0 \leq r \leq 1$; the larger $r$ becomes, the faster past
data is forgotten) indicative of a forgetting speed and
$k$ as the number of overlaps of normal distributions are
given in advance. In addition, a parameter $\alpha$
($\alpha > 0$), which is also assumed to be given in
advance, is used.

In Fig. 1, a parameter storage device 13 is a
device for storing the above-described parameter $\theta$,
and a parameter rewriting device 12 is capable of storing
a d-dimensional vector $\mu_i'$ and a d-dimensional square
matrix $\Sigma_i'$ as well. The reference numeral 10
represents a data input unit, 11 a probability calculation
device for calculating a probability and 14 a parameter
output unit.

Fig. 2 is a flow chart showing schematic
operation of the block illustrated in Fig. 1, and the
device of Fig. 1 operates as described in the following.
First, the value of each parameter stored in the
parameter storage device 13 is initialized before data
reading (Step S10). Next, the device operates in the
following manner every time t-th data $x_t$ is input. The
input $x_t$ is transferred to and stored in the
probability calculation device 11 and the parameter
rewriting device 12 (Step S11).

The probability calculation device 11 reads a
current value $\theta$ of the parameter from the parameter
storage device 13, calculates, based on the value, each
probability $\gamma_i$ ($i = 1, 2, \ldots, k$) that each
normal distribution generates the data $x_t$ according to
the following expression (Step S12) and sends the
calculation result to the parameter rewriting device 12:

$$\gamma_i := (1 - \alpha r) \frac{c_i \, p(x_t \mid \mu_i, \Sigma_i)}{\sum_{j=1}^{k} c_j \, p(x_t \mid \mu_j, \Sigma_j)} + \frac{\alpha r}{k}$$

The parameter rewriting device 12 reads the current
parameter value from the parameter storage device 13
and, for each of $i = 1, 2, \ldots, k$, sequentially
calculates an updated parameter value as shown in the
following expressions (2) to (6) by using the received
probability $\gamma_i$, to rewrite the parameter values
stored in the parameter storage device 13 (Step S13). In
these expressions (2) to (6), the sign ":=" signifies
that the right-side term is substituted for the left-side
term.



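$$c_i := (1 - r) c_i + r \gamma_i \qquad (2)$$

$$\mu_i' := (1 - r) \mu_i' + r \gamma_i \cdot x_t \qquad (3)$$

$$\mu_i := \mu_i' / c_i \qquad (4)$$

$$\Sigma_i' := (1 - r) \Sigma_i' + r \gamma_i \cdot x_t x_t^T \qquad (5)$$

$$\Sigma_i := \Sigma_i' / c_i - \mu_i \mu_i^T \qquad (6)$$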
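
The membership probability and the updating expressions (2) to (6) can be sketched in code. The following is a minimal illustration under stated assumptions, not the device itself: it treats one-dimensional data (d = 1) so that no matrix operations are needed, and the names `normal_pdf` and `sdem_step` are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a one-dimensional normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def sdem_step(x, c, mu, mu_bar, var, var_bar, r, alpha):
    """Apply one discounted update to a 1-D normal mixture.

    c, mu and var hold the weights, means and variances of the k
    components. mu_bar and var_bar hold the auxiliary statistics written
    mu_i' and Sigma_i' in the text. All lists are modified in place and
    the list of membership probabilities gamma_i is returned.
    """
    k = len(c)
    dens = [c[i] * normal_pdf(x, mu[i], var[i]) for i in range(k)]
    total = sum(dens)
    # Membership probability, smoothed toward the uniform value 1/k.
    gamma = [(1.0 - alpha * r) * dens[i] / total + alpha * r / k for i in range(k)]
    for i in range(k):
        c[i] = (1.0 - r) * c[i] + r * gamma[i]                      # (2)
        mu_bar[i] = (1.0 - r) * mu_bar[i] + r * gamma[i] * x        # (3)
        mu[i] = mu_bar[i] / c[i]                                    # (4)
        var_bar[i] = (1.0 - r) * var_bar[i] + r * gamma[i] * x * x  # (5)
        var[i] = var_bar[i] / c[i] - mu[i] ** 2                     # (6)
    return gamma
```

One assumed way to initialize the auxiliary statistics so that (4) and (6) reproduce the initial means and variances is `mu_bar[i] = c[i] * mu[i]` and `var_bar[i] = c[i] * (var[i] + mu[i] ** 2)`. A practical implementation would also guard against `total` underflowing to zero and against a variance becoming non-positive.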



Then, the parameter storage device 13 outputs the
rewritten parameter values (Step S14). The updating rule
is equivalent to maximization of a logarithmic
likelihood in which the (t-l)th data (l: positive
integer) has a weight of $(1-r)^l$, and realizes such
estimation as made by forgetting past data one by one.
This accordingly results in learning using roughly the
latest 1/r number of data.

This is because a solution of $(1-r)^l = 1/2$ is
expressed as:

$$l = -\frac{\log 2}{\log(1-r)} \approx \frac{\log 2}{r}$$
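
As a quick numerical check of this relation, the sketch below compares the exact solution with the approximation for small r. The function names are hypothetical and serve only this illustration.

```python
import math


def half_life(r):
    """Number of steps l after which the weight (1 - r)**l drops to 1/2."""
    return -math.log(2.0) / math.log(1.0 - r)


def half_life_approx(r):
    """First-order approximation (log 2) / r, reasonable for small r."""
    return math.log(2.0) / r
```

For example, with r = 0.01 both functions give a value close to 69, so data older than about 69 steps carries less than half the weight of the newest datum.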

Thus, the probability density is expressed by the
above Expression (1), and the function is completely
designated by a finite number of parameters. Therefore,
designating the parameter values is enough for
expressing the present probability density function, so
that the parameter output unit 14 illustrated in Fig. 1
enables estimation of the probability density function
in question. A device for calculating a degree of
outlier of input data using the thus estimated
probability density function is shown in the block
diagram of Fig. 3.

Fig. 3 is a block diagram showing one embodiment
of a degree of outlier calculation device. The present
device includes an input unit 20, the probability density
estimation device 21 illustrated in Fig. 1, a score
calculation device 22 for calculating a degree of
outlier of data, that is, a score, based on a
probability distribution estimated from the input data
and a parameter from the probability density estimation
device 21, and an output unit 23 for outputting the
calculation result. The device shown in Fig. 3 operates
in the following manner according to a flow chart of
Fig. 4 every time t-th data $x_t$ is input.

The input $x_t$ is transferred to the probability
density estimation device 21 (normal mixture) and the
score calculation device 22 (Step S20) and stored
therein. The probability density estimation device 21
updates a value of a stored parameter according to the
input data (Step S21) and inputs the new value to the
score calculation device 22. The score calculation
device 22 calculates a score using the input data, the
parameter value and the parameter values handed over in
the past (Step S22) and outputs the same (Step S23). A
score indicative of a degree of outlier is calculated,
for example, using a square distance, a Hellinger
distance or a logarithmic loss.

In the following, the calculation will be
described more specifically. Let $\theta^{(t)}$ be the
parameter estimated from data $x^t = x_1 x_2 \cdots x_t$,
and let $p^{(t)}(x) = p(x \mid \theta^{(t)})$. With respect
to probability distributions $p$ and $q$, where
$d_s(p, q)$ represents a square distance between the two
distributions and $d_h(p, q)$ represents a Hellinger
distance, any of the following can be used as a score:

$$d_s(p^{(t)}, p^{(t-1)}) = \int \left(p^{(t)}(x) - p^{(t-1)}(x)\right)^2 dx$$

$$d_h(p^{(t)}, p^{(t-1)}) = \int \left(\sqrt{p^{(t)}(x)} - \sqrt{p^{(t-1)}(x)}\right)^2 dx$$

A logarithmic loss can be calculated by the following
expression:

$$-\log p^{(t-1)}(x_t)$$

These can be immediately generalized into
$d_s(p^{(t)}, p^{(t-T)})$ etc. with $T$ as a positive integer.
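
The three scores can be illustrated numerically. The sketch below is an assumption-laden example rather than the method of the score calculation device: it handles one-dimensional densities passed as Python callables and approximates the integrals with a trapezoidal rule over a finite interval chosen by the caller. All function names are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a one-dimensional normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def integrate(f, lo, hi, n=20000):
    """Composite trapezoidal rule on [lo, hi] with n sub-intervals."""
    h = (hi - lo) / n
    s = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        s += f(lo + i * h)
    return s * h


def square_distance(p, q, lo, hi):
    """Approximate d_s(p, q), the integral of (p - q)^2."""
    return integrate(lambda x: (p(x) - q(x)) ** 2, lo, hi)


def hellinger_distance(p, q, lo, hi):
    """Approximate d_h(p, q), the integral of (sqrt(p) - sqrt(q))^2."""
    return integrate(lambda x: (math.sqrt(p(x)) - math.sqrt(q(x))) ** 2, lo, hi)


def log_loss(p_prev, x_t):
    """Logarithmic loss of x_t under the density estimated before x_t."""
    return -math.log(p_prev(x_t))
```

Here `p` would be the density after reading $x_t$ and `q` (or `p_prev`) the density before it. The two distances measure how far the estimate moved, while the logarithmic loss measures how unexpected $x_t$ was.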

Next, another embodiment of a probability density 
estimation device according to the present invention 
will be described. In this example, used as a data 
generation model is the following expression which is a 
kernel mixture distribution: 

$$p(x \mid q) = \frac{1}{k} \sum_{i=1}^{k} \omega(x : q_i)$$

In the expression, $\omega(\cdot : \cdot)$ is called a kernel
function, which is provided in the form of the following
normal density function (referred to as a normal
distribution kernel):

$$\omega(x : q_i) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x - q_i)^T \Sigma^{-1} (x - q_i)\right)$$

In the expression, $\Sigma$ represents a diagonal matrix
and the following equation holds:

$$\Sigma = \mathrm{diag}(\sigma^2, \ldots, \sigma^2)$$

$\sigma$ represents a given positive number. Each $q_i$
denotes a d-dimensional vector which is a parameter
designating a position of each kernel function, and
$\{q_i\}$ is called a prototype. $x_m$ represents an m-th
component of $x$. Similarly, $q_{im}$ represents an m-th
component of $q_i$.
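
Evaluating the kernel mixture for a given set of prototypes is straightforward. The sketch below assumes vectors are plain Python lists and uses the standard normalizing constant of a normal density with covariance $\sigma^2 I$. The function names are hypothetical.

```python
import math


def kernel(x, q_i, sigma):
    """Normal kernel centred on prototype q_i with covariance sigma^2 * I."""
    d = len(x)
    sq_dist = sum((x_m - q_im) ** 2 for x_m, q_im in zip(x, q_i))
    norm = (2.0 * math.pi * sigma ** 2) ** (-d / 2.0)
    return norm * math.exp(-0.5 * sq_dist / sigma ** 2)


def kernel_mixture_density(x, prototypes, sigma):
    """Equal-weight mixture of the kernels placed on the prototypes."""
    return sum(kernel(x, q_i, sigma) for q_i in prototypes) / len(prototypes)
```

Because every kernel shares the same width and weight, the prototype positions are the only parameters that need to be stored and updated.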

Fig. 5 is a block diagram showing a probability
density estimation device using a kernel mixture
distribution. A parameter storage device 32 has a
function of storing $q = (q_1, q_2, \ldots, q_k)$. In
Fig. 5, 30 denotes an input unit, 31 a parameter
rewriting device and 33 an output unit. The device shown
in Fig. 5 operates in the following manner according to
a flow chart of Fig. 6. First, prior to data reading, a
parameter value stored in the parameter storage device
32 is initialized (Step S30). Then, every time t-th data
$x_t$ is input, the device operates according to the
following procedures. The input $x_t$ is transferred to
the parameter rewriting device 31 (Step S31) and stored
therein. The parameter rewriting device 31 reads a
current parameter value $q$ from the parameter storage
device 32 and obtains a solution $\Delta q$ of the
following simultaneous linear equations
($k = 1, 2, \ldots, K$, $l = 1, 2, \ldots, d$)
($\delta_{ml}$ represents a Kronecker delta, that is,
when $m = l$, it equals 1 and otherwise equals 0) to
rewrite, as $q := q + \Delta q$, the parameter value
stored in the parameter storage device 32 (Step S32):

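$$\sum_{j=1}^{K} \sum_{m=1}^{d} C_{kl}^{jm} \Delta q_{jm} = r B_{kl} \qquad (7)$$

where

$$B_{kl} = K (x_{t+1,l} - q_{kl}) \exp\left(-\frac{|x_{t+1} - q_k|^2}{4\sigma^2}\right) - \sum_{j=1}^{K} (q_{jl} - q_{kl}) \exp\left(-\frac{|q_j - q_k|^2}{4\sigma^2}\right)$$

$$C_{kl}^{jm} = \left(\delta_{ml} - \frac{(q_{kl} - q_{jl})(q_{km} - q_{jm})}{2\sigma^2}\right) \exp\left(-\frac{|q_j - q_k|^2}{4\sigma^2}\right)$$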

The parameter storage device 32 outputs the rewritten 
parameter value (Step S33). 

In the foregoing updating rules, $r$ denotes a
parameter which controls a forgetting speed. More
specifically, a kernel mixture distribution obtained by
sequentially applying the rules in question minimizes a
square distance from a probability density expressed as
the following expression:

$$\sum_{\tau=2}^{t} r (1-r)^{t-\tau} \omega(x : x_\tau) + (1-r)^{t-1} \omega(x : x_1) \qquad (8)$$
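
The weights in Expression (8) can be inspected directly. The sketch below assumes the sum runs over $\tau = 2, \ldots, t$, which is the reading under which the weights add up to one. The function name is hypothetical.

```python
def discount_weights(t, r):
    """Weights of the kernels centred on x_1, ..., x_t in Expression (8).

    The first entry is the weight (1 - r)**(t - 1) of x_1; the remaining
    entries are r * (1 - r)**(t - tau) for tau = 2, ..., t.
    """
    weights = [(1.0 - r) ** (t - 1)]
    weights += [r * (1.0 - r) ** (t - tau) for tau in range(2, t + 1)]
    return weights
```

The newest datum always receives weight r, and each step back in time multiplies the weight by (1 - r), which is the forgetting behaviour described above.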






The algorithm by I. Grabec adopted by P. Burge and J.
Shawe-Taylor corresponds to the above expression with $r$
as a constant replaced by $1/\tau$. In this case, an
expression corresponding to Expression (8) will be
simply expressed as:

$$\frac{1}{t} \sum_{\tau=1}^{t} \omega(x : x_\tau)$$




An example of a degree of outlier calculation 
device for calculating a degree of outlier of input data 
using a parameter obtained from the probability density 
estimation device employing a kernel mixture 
distribution shown in Fig. 5 is illustrated in Fig. 7. 
In Fig. 7, 40 represents an input unit, 41 the 
probability density estimation device shown in Fig. 5, 
42 a score calculation device and 43 an output unit. 

The device illustrated in Fig. 7 operates
according to the following procedures and a flow chart
of Fig. 8 every time t-th data $x_t$ is input. The input
$x_t$ is transferred to the probability density estimation
device 41 (kernel mixture distribution) and the score
calculation device 42 (Step S40) and stored therein. The
probability density estimation device 41 updates a value
of a stored parameter according to the input data and
supplies the new value to the score calculation device
42. The score calculation device 42 calculates a score
using the input data, the value of the parameter and
values of parameters handed over in the past and outputs
the same (Steps S42 and S43). In this case, the same
score function as that in the degree of outlier
calculation device shown in Fig. 3 can be used.

Fig. 9 is a diagram showing an entire structure
of a histogram calculation device according to the
present invention. Discrete value data is sequentially
input to a parameter updating device 51, to which is
connected a histogram storage device 52 that stores a
parameter value of a histogram and outputs the same. 50
represents an input unit and 53 represents an output
unit.

Fig. 10 is a flow chart showing operation of the
device illustrated in Fig. 9. Assume that discrete value
data is designated by a number n of variables. Assume
here that an n-dimensional data space is divided into a
number N of mutually exclusive cells in advance and that
a histogram is formed on these cells. The histogram
represents a probability distribution with
$(p_1, \ldots, p_N)$ as a parameter.

Here, $p_j$ satisfies the following:

$$p_j \geq 0, \qquad \sum_{j=1}^{N} p_j = 1$$

Here, $p_j$ represents an occurrence probability of
a j-th cell. Assume that $T_0(j) = 0$ ($j = 1, \ldots, N$),
$0 < r < 1$ and $\beta > 0$ are given numbers and that
initial parameters are as follows (Step S50):

$$p^{(0)}(1) = \cdots = p^{(0)}(N) = 1/N$$



The parameter updating device 51 conducts
updating with respect to t-th input data (Step S51) in
the following manner (Step S52):

$$T_t(j) = (1-r) T_{t-1}(j) + \delta_t(j)$$

$$p^{(t)}(j) = \frac{T_t(j) + \beta}{(1 - (1-r)^t)/r + N\beta}$$



In the expression, $\delta_t(j)$ takes 1 when the t-th
data falls in the j-th cell and otherwise takes 0. This
updating is conducted with respect to all the cells.
The histogram is updated with
$p^{(t)}(1), \ldots, p^{(t)}(N)$ as its new parameters.
These values are sent to the histogram storage device 52.
The histogram storage device 52 stores several past
parameter values and outputs a part of them (Step S53).
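
The histogram update of Steps S50 to S52 can be sketched as a small class. This is an illustrative sketch only: the class name, the 0-based cell index and the choice to keep just the latest parameter vector are assumptions made for brevity.

```python
class DiscountingHistogram:
    """Histogram over N cells that is updated while forgetting past data."""

    def __init__(self, n_cells, r, beta):
        self.n_cells = n_cells
        self.r = r          # forgetting speed, 0 < r < 1
        self.beta = beta    # smoothing constant, beta > 0
        self.t = 0
        self.T = [0.0] * n_cells              # T_0(j) = 0
        self.p = [1.0 / n_cells] * n_cells    # p^(0)(j) = 1/N  (Step S50)

    def update(self, cell):
        """Read one datum falling in `cell` and return the new probabilities."""
        self.t += 1
        for j in range(self.n_cells):
            delta = 1.0 if j == cell else 0.0
            self.T[j] = (1.0 - self.r) * self.T[j] + delta
        denom = (1.0 - (1.0 - self.r) ** self.t) / self.r + self.n_cells * self.beta
        self.p = [(self.T[j] + self.beta) / denom for j in range(self.n_cells)]
        return list(self.p)
```

The denominator equals the sum of the numerators over all cells, so the returned values always add up to one. A fuller version would also retain several past parameter vectors, as the histogram storage device 52 does.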

The parameter updating device 51 conducts
calculation at each step by multiplying data from t
steps before by a weight of $(1-r)^t$. This weighting
means that the older the data is, the more it is
forgotten, and realizes in the device an algorithm that
learns while forgetting. As a result, it is possible
to flexibly follow a change of a user pattern.

A histogram represents a probability distribution
on a categorical variable and expresses, similarly to a
probability density function on a continuous variable,
the statistical character of a data generation mechanism.
Accordingly, the relationship between the "histogram
calculation device" and the "degree of outlier
calculation device" is completely the same as that
between the above-described "probability density
estimation device" and "degree of outlier calculation
device". More specifically, the "histogram calculation
device" expresses the statistical character of the data
generation mechanism, based on which the "degree of
outlier calculation device" calculates how much input
data deviates from the character of the data generation
mechanism as a "degree of outlier".

Fig. 11 shows an entire structure of a degree of
outlier calculation device using the histogram
calculation device illustrated in Fig. 9, and Fig. 12
shows a flow chart of the operation of the device.
Discrete value data from an input unit 60 is
sequentially input to a histogram calculation device 61
and a score calculation device 62 (Step S61). The score
calculation device 62 is connected to the histogram
calculation device 61, which outputs a parameter value of
the histogram from the input data (Step S62) and sends
the same to the score calculation device 62. With the
input data and the output of the histogram calculation
device 61 as inputs, the score calculation device 62
calculates a score of a degree of outlier of the input
data (Step S63).

As a score calculation method in this case, as
well as in a case of continuous value data, a square
distance, a Hellinger distance, a logarithmic loss, etc.
can be used. In the histogram, a probability value
$p^{(t)}(x)$ of data $x$ to be stored in a j-th cell at a
time t is calculated as follows:

$$p^{(t)}(x) = p^{(t)}(j) / L_j$$

In the expression, $L_j$ denotes the number of points to
be stored in the j-th cell and $p^{(t)}(j)$ denotes a
probability value of the j-th cell at the time t. Using
the equation, the square distance
$d_s(p^{(t)}, p^{(t-1)})$ and the Hellinger distance
$d_h(p^{(t)}, p^{(t-1)})$ are calculated according to the
following expressions, respectively:

$$d_s(p^{(t)}, p^{(t-1)}) = \sum_x \left(p^{(t)}(x) - p^{(t-1)}(x)\right)^2$$

$$d_h(p^{(t)}, p^{(t-1)}) = \sum_x \left(\sqrt{p^{(t)}(x)} - \sqrt{p^{(t-1)}(x)}\right)^2$$

For the score calculation device 62 to conduct
these calculations, the degree of outlier calculation
device should be set to receive parameter values of
$p^{(t)}$ and $p^{(t-1)}$ from the histogram calculation
device 61. In addition, a logarithmic loss for input data
$x_t$ at a time t is calculated by the following
expression:

$$-\log p^{(t-1)}(x_t)$$
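
For histograms the sums over points collapse into sums over cells, because all $L_j$ points of a cell share the same probability. The sketch below illustrates this under the assumption that the cell probabilities and cell sizes are given as lists. The function names are hypothetical.

```python
import math


def point_prob(cell_probs, cell_sizes, j):
    """Probability of a single point in cell j: p(j) / L_j."""
    return cell_probs[j] / cell_sizes[j]


def square_distance(p_t, p_prev, cell_sizes):
    """Sum over all points of (p_t(x) - p_prev(x))^2."""
    return sum(L * (a / L - b / L) ** 2 for a, b, L in zip(p_t, p_prev, cell_sizes))


def hellinger_distance(p_t, p_prev, cell_sizes):
    """Sum over all points of (sqrt(p_t(x)) - sqrt(p_prev(x)))^2."""
    return sum(
        L * (math.sqrt(a / L) - math.sqrt(b / L)) ** 2
        for a, b, L in zip(p_t, p_prev, cell_sizes)
    )


def log_loss(p_prev, cell_sizes, j):
    """Logarithmic loss of a datum in cell j under the previous histogram."""
    return -math.log(point_prob(p_prev, cell_sizes, j))
```

Note that the Hellinger distance computed this way does not depend on the cell sizes, whereas the square distance and the logarithmic loss do.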

The foregoing scores mean a change of an
estimated distribution measured as a statistical
distance or a logarithmic loss for an estimated
distribution of input data, and in either case their
statistical significance is unclear.

Fig. 13 is a diagram showing an entire structure
of a degree of outlier calculation device according to a
further embodiment of the present invention which
employs the normal mixture density estimation device
illustrated in Fig. 1 and the histogram calculation
device illustrated in Fig. 9, while Fig. 14 is a flow
chart showing operation thereof. Input data described
both in a discrete value and a continuous value is
sequentially input to a histogram calculation device 71,
a cell determination device 73 and a score calculation
device 74 (Step S71). Connected to the cell
determination device 73 are a number N of probability
density calculation devices 721 to 72N for a normal
mixture. Here, N denotes the number of cells in the
histogram of the histogram calculation device 71. To all
the probability density calculation devices 721 to 72N
and the histogram calculation device 71, the score
calculation device 74 is connected.

The histogram calculation device 71 calculates a
parameter of the histogram only from a discrete data
part of the input data (Step S72) and sends the same to
the score calculation device 74. The cell determination
device 73 determines to which cell of the histogram the
discrete data part of the input data belongs (Step S73)
and sends a continuous data part to the corresponding
probability density estimation device.

The probability density calculation devices 721
to 72N calculate a parameter of the probability density
only when receiving the input data sent in (Step S74)
and send the parameter to the score calculation device
74. The score calculation device 74 calculates a score
of the original input data with the input data, the
output from the histogram calculation device 71 and any
one of the outputs from the probability density
calculation devices 721 to 72N as inputs (Step S75) and
takes the score as an output (Step S76).

The score calculation device 74 calculates a
score, for example, as a degree of a change in a
probability distribution measured by a Hellinger
distance or as a negative logarithmic likelihood
(logarithmic loss) of a probability distribution with
respect to input data. Denote a vector made up of
categorical variables as $x$ and a vector made up of
continuous variables as $y$. A joint distribution
of $x$ and $y$ will be expressed as follows:

$$p(x, y) = p(x) \, p(y \mid x)$$

In the expression, $p(x)$ represents a probability
distribution of $x$ which is expressed by a histogram
density. $p(y \mid x)$ represents a conditional probability
distribution of $y$ given $x$. This is provided
for each divisional region. With respect to new input
data $D_t = (x_t, y_t)$, a Hellinger distance is calculated
in the following manner.

$$d_h(p^{(t)}, p^{(t-1)}) = \sum_x \int \left(\sqrt{p^{(t)}(x, y)} - \sqrt{p^{(t-1)}(x, y)}\right)^2 dy$$

$$= 2 - 2 \sum_x \sqrt{p^{(t)}(x) \, p^{(t-1)}(x)} \int \sqrt{p^{(t)}(y \mid x) \, p^{(t-1)}(y \mid x)} \, dy$$

These are immediately generalized into a distance
between $p^{(t)}$ and $p^{(t-T)}$, with $T$ as a positive
integer.

In addition, a logarithmic loss is calculated
according to the following expression:

$$-\log p^{(t-1)}(x_t) - \log p^{(t-1)}(y_t \mid x_t)$$
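
The logarithmic loss for mixed data can be sketched as follows. The example assumes that the discrete part has already been mapped to a cell index (the role of the cell determination device) and that each cell owns a one-dimensional conditional density supplied as a callable. All names are hypothetical.

```python
import math


def normal_pdf(y, mean, var):
    """Density of a one-dimensional normal distribution."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixed_log_loss(cell_probs, cond_densities, x_cell, y):
    """Return -log p(x) - log p(y | x) under the previous estimates.

    cell_probs[j] is the histogram probability of cell j and
    cond_densities[j] is a callable giving the density of y in cell j.
    """
    return -math.log(cell_probs[x_cell]) - math.log(cond_densities[x_cell](y))
```

A datum scores high either when its cell is rare or when its continuous part is unlikely for that cell, so both kinds of deviation contribute to the degree of outlier.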

Fig. 15 is a diagram showing an entire structure
of a degree of outlier calculation device according to
the present invention which employs the kernel mixture
distribution probability density estimation device
illustrated in Fig. 5 and the histogram calculation
device illustrated in Fig. 9, while Fig. 16 is a flow
chart showing operation thereof. Input data described
both in a discrete value and a continuous value is
sequentially input to a histogram calculation device 81,
a cell determination device 83 and a score calculation
device 84 (Step S81). To the cell determination device
83, a number N of probability density calculation
devices 821 to 82N for a kernel mixture distribution are
connected. Here, N denotes the number of cells in the
histogram of the histogram calculation device 81.
To all the probability density calculation
devices 821 to 82N and the histogram calculation device
81, the score calculation device 84 is connected. The
histogram calculation device 81 calculates a parameter
of the histogram only from a discrete data part of the
input data (Step S82) and sends the same to the score
calculation device 84. The cell determination device 83
determines to which cell of the histogram the discrete
data part of the input data belongs (Step S83) and sends
a continuous data part to the corresponding probability
density estimation device. The probability density
calculation devices 821 to 82N calculate a parameter of
the probability density only when receiving the input
data sent in (Step S84) and send the parameter to the
score calculation device 84 (Step S85).
The score calculation device 84 calculates a
score of the original input data with the input data,
the output from the histogram calculation device 81 and
any one of the outputs from the probability density
calculation devices 821 to 82N as inputs and takes the
score as an output (Step S86). The score calculation
method is the same as that of the degree of outlier
calculation device shown in Fig. 13.

Although the invention has been illustrated and
described with respect to an exemplary embodiment thereof,
it should be understood by those skilled in the art that
the foregoing and various other changes, omissions and
additions may be made therein and thereto, without
departing from the spirit and scope of the present
invention. Therefore, the present invention should not
be understood as limited to the specific embodiment set
out above but to include all possible embodiments which
can be embodied within a scope encompassed and
equivalents thereof with respect to the feature set out
in the appended claims.

WHAT IS CLAIMED IS:

1. For use in a degree of outlier calculation device 

for sequentially calculating a degree of outlier of each 
data with a data sequence of real vector values as input, 
a probability density estimation device for, while 

sequentially reading said data sequence, estimating a

probability distribution of generation of the data in 
question by using a finite mixture distribution of 
normal distributions, comprising: 

probability calculation means for calculating, 

based on a value of input data and values of a mean

parameter and a variance parameter of each of a finite 
number of normal distribution densities, a probability 
of generation of the input data in question from each 
normal distribution; and 

parameter rewriting means for updating and

rewriting the stored parameter values while forgetting 
past data, according to newly read data based on a 
probability obtained by the probability calculation 
means, values of a mean parameter and a variance 

parameter of each normal distribution and a weighting

parameter of each normal distribution. 



2. The probability density estimation device as set 

forth in claim 1, further comprising 

parameter storage means for storing values of a
mean parameter and a variance parameter of each of a
finite number of normal distribution densities and a 
weighting parameter of each normal distribution, wherein 

said parameter rewriting means updates and 
rewrites data of said parameter storage means.

3. A degree of outlier calculation device for

sequentially calculating a degree of outlier of each 
data with a data sequence of real vector values as input, 
comprising: 

a probability density estimation device for, 
while sequentially reading said data sequence, 
estimating a probability distribution of generation of 
the data in question by using a finite mixture of normal 
distributions including 

(a) parameter storage means for storing values of 
a mean parameter and a variance parameter of each of a 
finite number of normal distribution densities and a 
weighting parameter of each normal distribution, 

(b) probability calculation means for calculating, 
based on a value of input data and values of a mean 
parameter and a variance parameter of each of a finite 
number of normal distribution densities, a probability 

of generation of the input data in question from each 
normal distribution, and 

(c) parameter rewriting means for updating and 
rewriting the stored parameter values while forgetting
past data, according to newly read data based on a
probability obtained by the probability calculation 
means, values of a mean parameter and a variance 
parameter of each normal distribution and a weighting 
parameter of each normal distribution, and 

degree of outlier calculation means for 
calculating and outputting a degree of outlier of said 
data by using a parameter of the normal mixture updated 
by said probability density estimation device and based 
on a probability distribution estimated from values of 
the parameters before and after the updating and the 
input data. 

4. A probability density estimation device for use 

in a degree of outlier calculation device to, while 
sequentially reading a data sequence, estimate a 
probability distribution of generation of the data in 
question by using a finite number of normal kernel 
distributions, comprising: 

parameter storage means for storing a value of a 
parameter indicative of a position of each kernel, and 

parameter rewriting means for reading a value of 
a parameter from the storage means and updating the 
stored parameter values while forgetting past data, 
according to newly read data to rewrite the contents of 
the parameter storage means.

5. A degree of outlier calculation device for

sequentially calculating a degree of outlier of each 
data with a data sequence of real vector values as input, 
comprising: 

a probability density estimation device for, 
while sequentially reading said data sequence, 
estimating a probability distribution of generation of 
the data in question by using a finite number of normal 
kernel distributions including 

(a) parameter storage means for storing a value 
of a parameter indicative of a position of each kernel, 
and 

(b) parameter rewriting means for reading a value 
of a parameter from the storage means and updating the 
stored parameter values while forgetting past data, 
according to newly read data to rewrite the contents of 
the parameter storage means, and 

degree of outlier calculation means for 
calculating and outputting a degree of outlier of said 
data by using said parameter updated by said probability 
density estimation device and based on a probability 
distribution estimated from values of the parameters 
before and after the updating and the input data. 

6. For use in a degree of outlier calculation device 

for sequentially calculating a degree of outlier of each 
data with discrete value data as input, a histogram 



calculation device for calculating a parameter of a 
histogram with respect to said discrete value data 
sequentially input, comprising: 

storage means for storing a parameter value of 
said histogram, and 

parameter updating means for reading said 
parameter value from the storage means and updating past 
parameter values while forgetting past data based on 
input data to rewrite the value of said storage means, 
thereby outputting some of parameter values of said 
storage means.

7. A degree of outlier calculation device for

sequentially calculating a degree of outlier of each 
data with discrete value data as input, comprising: 

a histogram calculation device for calculating a 
parameter of a histogram with respect to said discrete 
value data sequentially input including 

storage means for storing a parameter value of 
said histogram, and 

parameter updating means for reading said 
parameter value from the storage means and updating past 
parameter values while forgetting past data based on 
input data to rewrite the value of said storage means, 
thereby outputting some of parameter values of said 
storage means, and 

score calculation means for calculating, based on
the output of the histogram calculation device and said
input data, a score of the input data in question with
respect to said histogram, thereby outputting the output
of the score calculation means as a degree of outlier of
said input data.

8. A degree of outlier calculation device for

calculating a degree of outlier with respect to 
sequentially input data which is described both in a 
discrete value and a continuous value, comprising:
a histogram calculation device for estimating a

histogram with respect to a discrete value data part, 

probability density estimation devices provided 
as many as the number of cells of said histogram for 
estimating a probability density with respect to a 
continuous value data part,

cell determination means for determining to which 
cell of said histogram said discrete value data part 
belongs to send the continuous data part to the 
corresponding one of said probability density estimation 
devices, and

score calculation means for calculating a score 
of said input data based on a probability distribution 
estimated from output values of said histogram 
calculation device and said probability density 
estimation device and said input data, thereby
outputting the output of the score calculation
means as a degree of outlier of said input data,

said histogram calculation device including 
storage means for storing a parameter value of 
said histogram, and 

parameter updating means for reading said 
parameter value from the storage means and updating past 
parameter values while forgetting past data based on 
input data to rewrite the value of said storage means, 
thereby outputting some of parameter values of said 
storage means, and 

said probability density estimation device 
including 

parameter storage means for storing values of a 
mean parameter and a variance parameter of each of a 
finite number of normal distribution densities and a 
weighting parameter of each normal distribution, 

probability calculation means for calculating, 
based on a value of input data, and values of a mean 
parameter and a variance parameter of each of a finite 
number of normal distribution densities, a probability 
of generation of the input data in question from each 
normal distribution, and 

parameter rewriting means for updating and 
rewriting the stored parameter values while forgetting 
past data, according to newly read data based on a 
probability obtained by the probability calculation 
means, values of a mean parameter and a variance
parameter of each normal distribution and a weighting
parameter of each normal distribution.

9. A degree of outlier calculation device for

calculating a degree of outlier with respect to 
sequentially input data which is described both in a 
discrete value and a continuous value, comprising:
a histogram calculation device for estimating a

histogram with respect to said discrete value data part, 

probability density estimation devices provided 
as many as the number of cells of said histogram for 
estimating a probability density with respect to a 
continuous value data part,

cell determination means for determining to which 
cell of the histogram said discrete value data part 
belongs to send the continuous data part to the 
corresponding one of said probability density estimation 
devices, and

score calculation means for calculating a score 
of said input data based on a probability distribution 
estimated from output values of said histogram 
calculation device and said probability density 
20 estimation device and said input data, thereby 

outputting the output of the score calculation 
means as a degree of outlier of said input data, 

said histogram calculation device including 
storage means for storing a parameter value of 
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25 said histogram, and 

parameter updating means for reading said 
parameter value from the storage means and updating past 
parameter values while forgetting past data based on 
input data to rewrite the value of said storage means, 
30 thereby outputting some of parameter values of said 

storage means , and 

said probability density estimation device 
including 

parameter storage means for storing a value of a 
35 parameter indicative of a position of each kernel, and 

parameter rewriting means for reading a value of 
a parameter from the storage means and updating the 
stored parameter values while forgetting past data, 
according to newly read data to rewrite the contents of 
40 the parameter storage means. 

10. For use in a degree of outlier calculation device
for sequentially calculating a degree of outlier of each
data with a data sequence of real vector values as input,
a probability density estimation method of, while
sequentially reading said data sequence, estimating a
probability distribution of generation of the data in
question by using a finite mixture of normal
distributions, comprising the steps of:

based on values of a mean parameter and a
variance parameter of each of a finite number of normal
distribution densities read from parameter storage means
for storing a value of input data, values of a mean
parameter and a variance parameter of each of a finite
number of normal distribution densities, and a weighting
parameter of each normal distribution, calculating a
probability of generation of the input data in question
from each normal distribution, and

updating the stored parameter values while
forgetting past data, according to newly read data based
on a probability obtained by the probability calculation
means, values of a mean parameter and a variance
parameter of each normal distribution and a weighting
parameter of each normal distribution to rewrite data of
said parameter storage means.
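
The two steps recited above can be illustrated with a minimal Python sketch. It is not taken from the specification: it assumes scalar data rather than real vectors, and it assumes that "forgetting past data" is realized by an exponential discount rate r. The names update_mixture, r and min_var are hypothetical and chosen only for this illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def update_mixture(x, weights, means, variances, r=0.05, min_var=1e-6):
    """One sequential update of a finite normal mixture (assumed discounting rule).

    Returns the new (weights, means, variances) as lists.
    """
    k = len(weights)
    # Step 1: probability that x was generated from each normal distribution.
    joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    post = [j / total for j in joint] if total > 0.0 else [1.0 / k] * k

    # Step 2: blend the stored statistics with the new datum; the factor
    # (1 - r) gradually forgets older data (r is an assumed parameter).
    new_w, new_m, new_v = [], [], []
    for w, m, v, g in zip(weights, means, variances, post):
        w2 = (1.0 - r) * w + r * g
        first = (1.0 - r) * w * m + r * g * x                  # discounted first moment
        second = (1.0 - r) * w * (v + m * m) + r * g * x * x   # discounted second moment
        m2 = first / w2
        v2 = max(second / w2 - m2 * m2, min_var)               # floor keeps the variance positive
        new_w.append(w2)
        new_m.append(m2)
        new_v.append(v2)
    return new_w, new_m, new_v
```

With this rule the weights continue to sum to one after every update, and a component far from the datum is left almost unchanged.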


11. A degree of outlier calculation method of
sequentially calculating a degree of outlier of each
data, with a data sequence of real vector values as
input, wherein

probability density estimation for, while
sequentially reading said data sequence, estimating a
probability distribution of generation of the data in
question by using a finite mixture of normal
distributions, comprises the steps of:

based on values of a mean parameter and a
variance parameter of each of a finite number of normal
distribution densities read from parameter storage means
for storing a value of input data, values of a mean
parameter and a variance parameter of each of a finite
number of normal distribution densities, and a weighting
parameter of each normal distribution, calculating a
probability of generation of the input data in question
from each normal distribution, and

updating the stored parameter values while
forgetting past data, according to newly read data based
on a probability obtained by the probability calculation
means, values of a mean parameter and a variance
parameter of each normal distribution and a weighting
parameter of each normal distribution to rewrite data of
said parameter storage means, and which further
comprises the step of:

calculating and outputting a degree of outlier of
said data by using a parameter of the finite mixture
distribution updated by said probability density
estimation and based on a probability distribution
estimated from values of the parameters before and after
the updating and the input data.

12. A probability density estimation method for use
in calculation of a degree of outlier to, while
sequentially reading a data sequence, estimate a
probability distribution of generation of the data in
question by using a finite number of normal kernel
distributions, comprising the steps of:

storing a value of a parameter indicative of a
position of each kernel in parameter storage means, and

reading a value of a parameter from the storage
means and updating the stored parameter values while
forgetting past data, according to newly read data to
rewrite the contents of the parameter storage means.

13. A degree of outlier calculation method of
sequentially calculating a degree of outlier of each
data, with a data sequence of real vector values as
input, wherein

probability density estimation for, while
sequentially reading said data sequence, estimating a
probability distribution of generation of the data in
question by using a finite number of normal kernel
distributions comprises the steps of:

storing a value of a parameter indicative of a
position of each kernel in parameter storage means,

reading a value of a parameter from the storage
means and updating the stored parameter values while
forgetting past data, according to newly read data to
rewrite the contents of the parameter storage means, and

which further comprises:

degree of outlier calculation means for
calculating and outputting a degree of outlier of said
data by using said parameter updated by said probability
density estimation and based on a probability
distribution estimated from values of the parameters
before and after the updating and the input data.

14. For use in calculation of a degree of outlier for
sequentially calculating a degree of outlier of each
data with discrete value data as input, a histogram
calculation method of calculating a parameter of a
histogram with respect to said discrete value data
sequentially input, comprising the steps of:

reading said parameter value from storage means
for storing a parameter value of said histogram and
updating past parameter values while forgetting past
data based on input data to rewrite the value of said
storage means, and

outputting some of parameter values of said
storage means.
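
As an illustration of a histogram update that forgets past data, the following Python sketch discounts every stored count before adding the new observation. The discount rate r, the smoothing constant beta and the function name update_histogram are assumptions made for this example and are not recited in the claim.

```python
def update_histogram(counts, cell, r=0.05, beta=0.5):
    """Discount the stored counts, add the new datum, and return (counts, probs).

    counts : list of discounted counts, one per histogram cell (the stored parameter)
    cell   : index of the cell into which the new discrete datum falls
    r      : assumed forgetting rate, 0 < r < 1
    beta   : assumed smoothing constant so that empty cells keep a nonzero probability
    """
    new_counts = [(1.0 - r) * c for c in counts]   # forget past data
    new_counts[cell] += 1.0                        # absorb the new datum
    total = sum(new_counts) + beta * len(new_counts)
    probs = [(c + beta) / total for c in new_counts]
    return new_counts, probs
```

For example, starting from three empty cells and observing cell 1 with beta = 0.5 gives cell probabilities of 0.2, 0.6 and 0.2.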

15. A degree of outlier calculation device for
sequentially calculating a degree of outlier of each
data with discrete value data as input, comprising:

a histogram calculation device for calculating a
parameter of a histogram with respect to said discrete
value data sequentially input including

storage means for storing a parameter value of
said histogram, and

parameter updating means for reading said
parameter value from the storage means and updating past
parameter values while forgetting past data based on
input data to rewrite the value of said storage means,
thereby outputting some of parameter values of said
storage means, and

score calculation means for calculating, based on
the output of the histogram calculation device and said
input data, a score of the input data in question with
respect to said histogram, thereby outputting the score
calculation result as a degree of outlier of said input
data.
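
One possible form of the score calculation is sketched below in Python, assuming a logarithmic-loss score (the negative logarithm of the probability the histogram assigns to the observed cell). Scoring each datum against the histogram held before that datum is absorbed is likewise an assumption; score_stream, r and beta are hypothetical names.

```python
import math


def score_stream(cells, n_cells, r=0.05, beta=0.5):
    """Return one outlier score per datum for a stream of cell indices."""
    counts = [0.0] * n_cells
    scores = []
    for cell in cells:
        # Score the datum against the histogram learned so far.
        total = sum(counts) + beta * n_cells
        p = (counts[cell] + beta) / total
        scores.append(-math.log(p))
        # Then update the histogram while forgetting past data.
        counts = [(1.0 - r) * c for c in counts]
        counts[cell] += 1.0
    return scores
```

A cell that has rarely been observed receives a small probability and therefore a large score, so a run of identical values followed by a different value yields a pronounced jump in the final score.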

16. A degree of outlier calculation method of
calculating a degree of outlier with respect to
sequentially input data which is described both in a
discrete value and a continuous value, wherein

histogram calculation which estimates a histogram
with respect to a discrete value data part comprises the
steps of:

reading said parameter value from storage means
for storing a parameter value of said histogram and
updating past parameter values while forgetting past
data based on input data to rewrite the value of said
storage means, and

outputting some of parameter values of said
storage means, and wherein

in probability density estimation devices
provided as many as the number of cells of said
histogram for estimating a probability density with
respect to a continuous value data part, said method 
comprises the steps of: 

based on values of a mean parameter and a 
variance parameter of each of a finite number of normal 
distribution densities read from parameter storage means 
for storing a value of input data, values of a mean 
parameter and variance parameter of each of a finite 
number of normal distribution densities and a weighting 
parameter of each normal distribution, calculating a 
probability of generation of the input data in question 
from each normal distribution, and 

based on a probability obtained by the 
probability calculation means, values of a mean 
parameter and a variance parameter of each normal 
distribution and a weighting parameter of each normal 
distribution, updating the stored parameter values while 
forgetting past data, according to newly read data to 
rewrite the data of said parameter storage means, and 
wherein said method further comprises the steps of: 

determining to which cell of said histogram said 
discrete value data part belongs to send the continuous 
data part to the corresponding one of said probability 
density estimation devices, 

calculating a score of said input data based on a 
probability distribution estimated from output values of 
said histogram calculation device and said probability 
density estimation device and said input data, and

outputting the score calculation result as a
degree of outlier of said input data.
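
The routing of the continuous part to the estimator of the matching histogram cell can be sketched in Python as follows. This is a simplified illustration under stated assumptions: each cell holds a single normal density instead of a finite mixture, the data are scalar, forgetting uses an exponential rate r, and the score is the negative log of the joint probability. The class name MixedOutlierScorer and all parameter names are hypothetical.

```python
import math


class MixedOutlierScorer:
    """Histogram over discrete cells plus one density estimator per cell."""

    def __init__(self, n_cells, r=0.05, beta=0.5, min_var=1e-6):
        self.r = r
        self.beta = beta
        self.min_var = min_var
        self.counts = [0.0] * n_cells                          # histogram parameters
        self.params = [(0.0, 1.0) for _ in range(n_cells)]     # (mean, variance) per cell

    def score_and_update(self, cell, y):
        # Score: -log of (cell probability) * (density of y within that cell).
        total = sum(self.counts) + self.beta * len(self.counts)
        p_cell = (self.counts[cell] + self.beta) / total
        mean, var = self.params[cell]
        density = math.exp(-((y - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
        score = -math.log(p_cell) - math.log(max(density, 1e-300))

        # Histogram update with forgetting.
        self.counts = [(1.0 - self.r) * c for c in self.counts]
        self.counts[cell] += 1.0

        # Only the estimator of the determined cell receives the continuous part.
        diff = y - mean
        new_mean = mean + self.r * diff
        new_var = max((1.0 - self.r) * (var + self.r * diff * diff), self.min_var)
        self.params[cell] = (new_mean, new_var)
        return score
```

In use, a datum whose continuous part lies far from the values previously seen in its cell receives a much larger score than a typical datum, while the estimators of the other cells are left untouched.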

ABSTRACT OF THE DISCLOSURE

The degree of outlier of an input datum is calculated
as the amount by which the learned probability density
changes, relative to the density before learning, when
the datum is taken in. This is because data that differs
largely in tendency from the probability density
function learned so far can be considered to have a high
degree of outlier. More specifically, a function of the
distance between the probability densities before and
after the data input is calculated as the degree of
outlier. Accordingly, a probability density estimation
device appropriately estimates the probability
distribution of generation of unfair data while
sequentially reading a large volume of data, and a score
calculation device calculates and outputs the degree of
outlier of each datum based on the estimated probability
distribution.
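
The idea of scoring a datum by the distance between the densities before and after it is learned can be illustrated with the short Python sketch below. It assumes a single univariate normal density, an exponential forgetting rate r, and the squared Hellinger distance (in the convention where it equals one minus the Bhattacharyya coefficient) as the distance function; none of these choices is stated in the abstract, and the function names are hypothetical.

```python
import math


def hellinger_sq_normal(mean_a, var_a, mean_b, var_b):
    """Squared Hellinger distance between two univariate normal densities."""
    sum_var = var_a + var_b
    # Bhattacharyya coefficient of the two normals (closed form).
    bc = math.sqrt(2.0 * math.sqrt(var_a * var_b) / sum_var) * math.exp(
        -((mean_a - mean_b) ** 2) / (4.0 * sum_var)
    )
    return 1.0 - bc


def change_score(y, mean, var, r=0.05):
    """Distance between the density before and after absorbing datum y."""
    diff = y - mean
    new_mean = mean + r * diff                       # forgetting-type mean update
    new_var = (1.0 - r) * (var + r * diff * diff)    # forgetting-type variance update
    return hellinger_sq_normal(mean, var, new_mean, new_var)
```

A datum close to the current mean barely moves the density and scores near zero, whereas a distant datum shifts both the mean and the variance and scores noticeably higher.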